model (GLM).

Don’t confuse the generalized linear model with the very similarly named general linear model. It’s unfortunate that these two names are almost identical, because they describe two very different things. The general linear model is usually abbreviated LM, and the generalized linear model is abbreviated GLM, so we use those abbreviations here. (However, some older textbooks use GLM to mean LM, because they were written before the generalized linear model, introduced in the early 1970s, came into common use.)

GLM is similar to LM in that the predictor variables usually appear in the model as the familiar linear combination:

$c_0 + c_1x_1 + c_2x_2 + c_3x_3 + \dots$

where the x’s are the predictor variables, and the c’s are the regression coefficients (with $c_0$ being called a constant term, or intercept).
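As a minimal sketch of the linear combination described above, the following Python function adds the intercept to the sum of each coefficient times its predictor. The function name and the numbers are hypothetical, chosen only for illustration.

```python
def linear_combination(coefs, xs):
    """Return c0 + c1*x1 + c2*x2 + ... for one observation.

    coefs[0] is the intercept c0; coefs[1:] pair up with the predictors xs.
    """
    if len(coefs) != len(xs) + 1:
        raise ValueError("need one coefficient per predictor plus an intercept")
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], xs))


# Hypothetical coefficients (c0, c1, c2) and predictor values (x1, x2).
V = linear_combination([1.0, 0.5, -2.0], [4.0, 1.5])
print(V)  # 1.0 + 0.5*4.0 - 2.0*1.5 = 0.0
```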

But GLM extends the capabilities of LM in two important ways:

With LM, the outcome is assumed to be a continuous, normally distributed variable. But with GLM, the outcome can be continuous or integer-valued, and it can follow one of several different distributions, such as normal, exponential, binomial (as in logistic regression), or Poisson.

With LM, the linear combination becomes the predicted value of the outcome, but with GLM, you can specify a link function. The link function defines a transformation that turns the linear combination into the predicted value. (Formally, the link is defined in the other direction, from the predicted value to the linear combination, so the transformation applied to the linear combination is the inverse of the link.) As we note in Chapter 18, logistic regression applies exactly this kind of transformation: Let’s call the linear combination V. In logistic regression, V is sent through the logistic function $1/(1 + e^{-V})$ to convert it into a predicted probability of having the outcome event. So if you select the correct link function, you can use GLM to perform logistic regression.
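To make the logistic transformation concrete, here is a small Python sketch that sends a value of V through the logistic function, 1/(1 + e^(-V)). The function name and the sample values of V are illustrative assumptions. The two-branch form is only a guard against numeric overflow for very negative V; both branches compute the same quantity.

```python
import math


def logistic(v):
    """Map any real number v to a probability between 0 and 1."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    # Algebraically identical form that avoids overflow when v is very negative.
    e = math.exp(v)
    return e / (1.0 + e)


# Hypothetical values of the linear combination V.
for v in (-2.0, 0.0, 2.0):
    print(v, round(logistic(v), 3))
```

A V of 0 corresponds to a predicted probability of 0.5; negative values of V give probabilities below 0.5, and positive values give probabilities above 0.5.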

GLM is the Swiss army knife of regression. If you select the right distribution and link function, you can use it to do ordinary least-squares regression, logistic regression, Poisson regression, and a whole lot more. Most statistical software offers a GLM function, so other specialized regressions don’t need to be programmed separately. If the software you are using doesn’t offer logistic or Poisson regression, check to see whether it offers GLM, and if it does, use that instead. (Flip to Chapter 4 for an introduction to statistical software.)
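The sketch below illustrates, under stated assumptions, why one GLM routine can stand in for several specialized regressions: the same linear combination is computed in every case, and only the transformation applied to it changes. The pairings shown (identity for ordinary least squares, logistic for logistic regression, exponential for Poisson regression) are the customary defaults. The names `TRANSFORMS` and `predict` and the coefficients are hypothetical, and this covers prediction only, not the fitting of coefficients that real GLM software performs.

```python
import math

# Transformation applied to the linear combination V for each regression type
# (formally, the inverse of that model's usual link function).
TRANSFORMS = {
    "ols": lambda v: v,                               # normal outcome, identity link
    "logistic": lambda v: 1.0 / (1.0 + math.exp(-v)), # binomial outcome, logit link
    "poisson": math.exp,                              # Poisson outcome, log link
}


def predict(kind, coefs, xs):
    """Predicted outcome for one observation under the chosen regression type."""
    v = coefs[0] + sum(c * x for c, x in zip(coefs[1:], xs))
    return TRANSFORMS[kind](v)


# Hypothetical coefficients and predictors that give V = 0.
coefs, xs = [1.0, -0.5], [2.0]
for kind in ("ols", "logistic", "poisson"):
    print(kind, predict(kind, coefs, xs))
```

With V equal to 0, the three settings give a predicted value of 0, a predicted probability of 0.5, and a predicted count of 1, respectively.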

Running a Poisson regression

Suppose that you want to study the number of fatal highway accidents per year in a city. Table 19-1 shows some made-up fatal-accident data over the course of 12 years. Figure 19-1 shows a graph of this data, created using the R statistical software package.